
Introduction to Time Series Analysis - 01

This note is for course MATH 545 at McGill University.

Lecture 1 - Lecture 3

Reference Book

  • Introduction to Time Series and Forecasting (by Brockwell and Davis)
  • The Analysis of Time Series: an Introduction with R (by Chatfield and Xing)

Time series

\(\{X_t\}\) is a collection of random variables, where \(t\) is the index of time.

The process of dealing with a time series
  1. Describe: plot the data to obtain a concise summary
  2. Explain: fit probabilistic models (joint distributions)
  3. Predict: forecast future values together with a quantification of their uncertainty

If the \(X_t\) were mutually independent, the joint distribution would factorize as \(Pr(X_1 \leq x_1, X_2 \leq x_2, ..., X_n \leq x_n) = \prod_{i=1}^n Pr(X_i \leq x_i)\). For more general (dependent) models we instead work with the decomposition \(Pr(X_1 \leq x_1, X_2 \leq x_2, ..., X_n \leq x_n) = Pr(X_1 \leq x_1)\,Pr(X_2 \leq x_2 \mid X_1 \leq x_1) \cdots Pr(X_n \leq x_n \mid X_1 \leq x_1, ..., X_{n-1} \leq x_{n-1})\).

Semi-parametric model

In semi-parametric models we do not specify the pdf or cdf of the random variables; instead we specify only \(E(X_t)\) and \(Cov(X_t, X_{t+j})\).

Examples
  1. iid noise: let \(E(X_t) = 0\) for all \(t\), and \(Pr(X_1 \leq x_1, X_2 \leq x_2, ..., X_n \leq x_n) = \prod_{i=1}^n Pr(X_i \leq x_i) = \prod_{i=1}^n F(x_i)\), where \(F(\cdot)\) is the common cumulative distribution function.
  2. random walk: let \(\{X_t\}\) be iid noise and set \(S_t = X_1 + X_2 + ... + X_t\). Then \(\{S_t\}\) is a random walk. (Note that the \(S_t\) are not independent, but \(E(S_t) = 0\).)

Models with trend

Let \(\{Y_t\}\) be a time series with \(E(Y_t) = 0\) for all \(t\), and let \(X_t = m_t + Y_t\), where \(m_t\) is a slowly changing function of time (the trend). (Note that \(Y_t\) is the mean-zero noise component, while \(m_t\) is what gives \(X_t\) a non-zero mean: \(E(X_t) = m_t\).)

Common choices for \(m_t\) include a linear function of \(t\) or, more generally, a polynomial in \(t\).
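
As a minimal sketch (not part of the notes), such a trend \(\hat{m}_t\) can be estimated by ordinary least squares; the series below is simulated purely for illustration.

```python
import numpy as np

# Hypothetical example: linear trend plus mean-zero noise Y_t
rng = np.random.default_rng(0)
n = 200
t = np.arange(n)
x = 0.5 + 0.03 * t + rng.normal(0, 1, n)   # X_t = m_t + Y_t

# Fit a degree-1 polynomial m_t = b0 + b1 * t by least squares
coeffs = np.polyfit(t, x, deg=1)           # returns (b1, b0), highest degree first
m_hat = np.polyval(coeffs, t)              # fitted trend \hat{m}_t

residuals = x - m_hat                      # X_t - \hat{m}_t, the residual process
print(coeffs)                              # roughly (0.03, 0.5)
```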

Models with seasonal variation (periodicity)

Let \(X_t = S_t + Y_t\), where \(E(Y_t) = 0\) for all \(t\) and \(S_t\) is a periodic function of \(t\) with period \(d\) (i.e. \(S_{t-d} = S_t\)).

Common choices for \(S_t\) include sums of harmonic functions, \(S_t = a_0 + \sum_{j=1}^{k} (a_j \cos(\lambda_j t) + b_j \sin(\lambda_j t))\), where the \(a_j\) and \(b_j\) are estimated from the data and the \(\lambda_j\) are fixed frequencies.
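
As a sketch: once the frequencies \(\lambda_j\) are fixed, the coefficients \(a_j, b_j\) can be estimated by linear least squares on the cosine and sine regressors. The monthly period \(d = 12\) and the single frequency below are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 240, 12                       # hypothetical monthly data with period d = 12
t = np.arange(n)
lam = 2 * np.pi / d                  # one fixed frequency lambda_1
x = 3 + 2 * np.cos(lam * t) - np.sin(lam * t) + rng.normal(0, 1, n)

# Design matrix [1, cos(lam*t), sin(lam*t)]; estimate (a0, a1, b1) by least squares
D = np.column_stack([np.ones(n), np.cos(lam * t), np.sin(lam * t)])
coef, *_ = np.linalg.lstsq(D, x, rcond=None)
s_hat = D @ coef                     # fitted seasonal component \hat{S}_t
print(coef)                          # approximately (3, 2, -1)
```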

General strategy for analysis
  1. Plot the data to
    1. identify potential signal (trend, seasonal)
    2. identify possible models for the residual process
    3. identify outliers and other weird things
  2. Remove the signal
  3. Choose a model to fit the residuals and estimate the dependence
  4. Forecast by projecting the residuals forward and adding them back to the estimated signal

Why we focus on the residuals (i.e. \(X_t - \hat{m}_t\), \(X_t - \hat{S}_t\))

Let \(W_i \overset{\text{iid}}{\sim} N(\mu, \sigma^2)\); then \(W_i - \mu \sim N(0, \sigma^2)\). We can estimate \(\mu\) and subtract it to remove the signal, and then estimate \(\sigma^2\) from the residuals.

Stationary process (series)

Informally, a process is stationary if \(\{X_s\}_{s=0, 1, ..., n}\) has the same statistical properties as \(\{X_{t+s}\}_{s=0, 1, ..., n}\) for every shift \(t\). (Note that we will focus on the first and second order moments.) iid noise is a special case of a stationary process.

Def. \(X_t\) is weakly stationary if

  1. \(E(X_t) = \mu_X(t)\) is independent of \(t\)
  2. \(Cov(X_r, X_s) = E((X_r - \mu_X(r))(X_s - \mu_X(s))) = \gamma_X(r, s)\), where \(\gamma_X\) is the covariance function of \(X_t\)

We require that \(\gamma_X(t+h, t)\) is independent of \(t\) (i.e. \(\gamma_X(t+h, t) = \gamma_X(h, 0) = Cov(X_h, X_0)\)).

Def. For strong stationarity, we require that the joint distribution of \(\{X_s\}_{s=0, 1, ..., n}\) is the same as that of \(\{X_{t+s}\}_{s=0, 1, ..., n}\) for all \(t\) and \(n\).

We define \(\gamma_X(h) = \gamma_X(h, 0)\) to be the auto-covariance function of a stationary series at lag \(h\).

We define \(\rho_X(h)\) to be the auto-correlation function at lag \(h\): \(\rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)} = Cor(X_{t+h}, X_t)\).
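
As a small sketch (not from the lecture), the usual sample versions of \(\gamma_X(h)\) and \(\rho_X(h)\) for an observed series can be computed as follows, using the conventional estimator that divides by \(n\).

```python
import numpy as np

def sample_acvf(x, h):
    """Sample autocovariance at lag h (dividing by n, as is conventional)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    h = abs(h)
    return np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n

def sample_acf(x, h):
    """Sample autocorrelation: gamma_hat(h) / gamma_hat(0)."""
    return sample_acvf(x, h) / sample_acvf(x, 0)

# For iid noise the sample ACF at nonzero lags should be close to 0
rng = np.random.default_rng(2)
x = rng.normal(0, 1, 1000)
print([round(sample_acf(x, h), 3) for h in range(4)])
```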

Useful identity

If \(E(X^2) < \infty\), \(E(Y^2) < \infty\), \(E(Z^2) < \infty\) and \(a, b, c\) are real constants, then \(Cov(aX + bY + c, Z) = a\,Cov(X, Z) + b\,Cov(Y, Z)\).

Example 1: iid noise

\(X_t \overset{\text{iid}}{\sim} N(0, \sigma^2)\)

By definition we have \(E(X_t) = 0\). Since \(E(X_t^2) = \sigma^2 < \infty\), \(\gamma_X(h) = Cov(X_{t+h}, X_t) = \begin{cases} \sigma^2, & \text{if } h = 0 \\ 0, & \forall h \neq 0 \text{ (by independence)} \end{cases}\)

Therefore an iid noise process is weakly stationary.

Example 2: White Noise Process

If \(\{X_t\}\) is a sequence of uncorrelated random variables with \(E(X_t) = 0\), \(Var(X_t) = \sigma^2 < \infty\), and \(\gamma_X(h) = 0\) for all \(h \neq 0\), then we refer to it as white noise and write \(X_t \sim WN(0, \sigma^2)\).

Note that iid noise is white noise, but white noise is not necessarily iid noise.

Example 3

Suppose \(\{W_t\}\) and \(\{Z_t\}\) are iid sequences with \(\{W_t\} \perp \{Z_t\}\).

Let the \(W_t\) follow a Bernoulli distribution with \(Pr(W_t = 0) = Pr(W_t = 1) = 1/2\).

Let the \(Z_t\) follow a transformed Bernoulli distribution with \(Pr(Z_t = -1) = Pr(Z_t = 1) = 1/2\).

Set \(X_t = W_t(1 - W_{t-1})Z_t\); the possible values of \(X_t\) are given in the following table:

| \(W_{t-1}\) | \(W_t\) | \(X_t\) |
| --- | --- | --- |
| 1 | 0 | 0 |
| 1 | 1 | 0 |
| 0 | 0 | 0 |
| 0 | 1 | \(Z_t\) |

\(E(X_t) = E(W_t)\,E(1 - W_{t-1})\,E(Z_t) = \frac{1}{2} \times \frac{1}{2} \times 0 = 0\)

When calculating covariance, there are two cases:

  1. \(h = 0\)

\(Cov(X_t, X_{t+h}) = E(X_t X_{t+h}) = E(W_t^2 (1 - W_{t-1})^2 Z_t^2) = E(W_t^2)\,E((1 - W_{t-1})^2)\,E(Z_t^2) = \frac{1}{2} \times \frac{1}{2} \times 1 = \frac{1}{4}\)

  2. \(h \neq 0\)

\(Cov(X_t, X_{t+h}) = E(X_t X_{t+h}) = E\big(W_t(1 - W_{t-1})Z_t \, W_{t+h}(1 - W_{t+h-1})Z_{t+h}\big) = 0\), since \(Z_t\) is independent of all the other factors and \(E(Z_t) = 0\).

Therefore, \(X_t\) is a white noise process.

Note that \(X_t\) and \(X_{t-1}\) are dependent but uncorrelated.
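
A quick numerical check of this example (a sketch, not from the notes): simulate \(W_t\) and \(Z_t\), form \(X_t\), and verify that the mean and the lag-1 correlation are near zero while \(X_t\) and \(X_{t-1}\) remain dependent.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
w = rng.integers(0, 2, n + 1)            # W_t in {0, 1}, each with probability 1/2
z = rng.choice([-1, 1], n + 1)           # Z_t in {-1, 1}, each with probability 1/2
x = w[1:] * (1 - w[:-1]) * z[1:]         # X_t = W_t (1 - W_{t-1}) Z_t

print(x.mean())                          # close to 0
print(np.corrcoef(x[1:], x[:-1])[0, 1])  # lag-1 correlation close to 0
# Dependence: whenever X_{t-1} != 0 we must have W_{t-1} = 1, which forces X_t = 0
print(np.abs(x[1:][x[:-1] != 0]).max())  # exactly 0
```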

Example 4: Random walk

Let \(\{X_t\}\) be iid noise with variance \(\sigma^2\), and let \(S_t = X_1 + ... + X_t = \sum_{i=1}^{t} X_i\).

We have \(E(S_t) = 0\) and \(Var(S_t) = t\sigma^2\).

For \(h \geq 0\): \(Cov(S_{t+h}, S_t) = Cov(S_t + [X_{t+1} + ... + X_{t+h}], S_t) = Cov(S_t, S_t) + Cov(X_{t+1} + ... + X_{t+h}, S_t) = t\sigma^2 + 0 = t\sigma^2\)

Therefore, the random walk is not stationary: its variance and covariance depend on \(t\).
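
A small simulation sketch of this (the sample size and \(\sigma\) below are arbitrary): across many independent replications, the sample variance of \(S_t\) grows roughly like \(t\sigma^2\).

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 1.0
reps, n = 5000, 100
x = rng.normal(0, sigma, size=(reps, n))   # iid noise X_1, ..., X_n in each replication
s = x.cumsum(axis=1)                       # random walks S_t = X_1 + ... + X_t

for t in (10, 50, 100):
    print(t, s[:, t - 1].var())            # approximately t * sigma^2
```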

Example 5: First order moving average process (MA(1))

Let \(Z_t \sim WN(0, \sigma^2)\) and let \(X_t = Z_t + \theta Z_{t-1}\), \(t = 0, \pm 1, \pm 2, ...\), where \(\theta\) is a real-valued constant.

(Graphical representation will be added later)

We have \(E(X_t) = E(Z_t) + \theta E(Z_{t-1}) = 0\).

\(Var(X_t) = E(X_t^2) = E((Z_t + \theta Z_{t-1})^2) = E(Z_t^2) + 2\theta E(Z_t Z_{t-1}) + \theta^2 E(Z_{t-1}^2) = (1 + \theta^2)\sigma^2\)

When calculating covariance, there are three cases:

  1. \(h = 0\)

\(\gamma_X(t+h, t) = E(X_{t+h} X_t) = E(X_t^2) = (1 + \theta^2)\sigma^2\)

  2. \(h = \pm 1\)

Taking \(h = 1\) (the case \(h = -1\) is symmetric): \(\gamma_X(t+h, t) = E(X_{t+h} X_t) = E((Z_{t+1} + \theta Z_t)(Z_t + \theta Z_{t-1})) = E(Z_{t+1} Z_t) + \theta E(Z_t^2) + \theta E(Z_{t+1} Z_{t-1}) + \theta^2 E(Z_t Z_{t-1}) = \theta\sigma^2\)

  3. \(|h| > 1\)

\(\gamma_X(t+h, t) = E(X_{t+h} X_t) = E((Z_{t+h} + \theta Z_{t+h-1})(Z_t + \theta Z_{t-1})) = 0\), because the indices \(t\), \(t-1\), \(t+h\), \(t+h-1\) are all distinct when \(|h| > 1\), so every cross term has expectation zero.

Therefore, MA(1) is stationary, and \(\rho_X(h) = \begin{cases} 1, & h = 0 \\ \frac{\theta}{1 + \theta^2}, & h = \pm 1 \\ 0, & |h| > 1 \end{cases}\)
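
As a closing sketch (with assumed values \(\theta = 0.6\) and \(\sigma = 1\)), simulating an MA(1) and comparing its sample ACF at lags 0, 1, 2 with the formula above:

```python
import numpy as np

rng = np.random.default_rng(5)
theta, sigma, n = 0.6, 1.0, 50_000
z = rng.normal(0, sigma, n + 1)       # white noise Z_t
x = z[1:] + theta * z[:-1]            # X_t = Z_t + theta * Z_{t-1}

def acf(x, h):
    """Sample autocorrelation at lag h."""
    x = x - x.mean()
    return np.sum(x[h:] * x[:len(x) - h]) / np.sum(x * x)

print([round(acf(x, h), 3) for h in range(3)])       # sample ACF at lags 0, 1, 2
print([1.0, round(theta / (1 + theta**2), 3), 0.0])  # theoretical values: 1, 0.441, 0
```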